CLEAN rewards for improving multiagent coordination in the presence of exploration
Authors
Abstract
In cooperative multiagent systems, coordinating the joint actions of agents is difficult. A fundamental difficulty is the slow learning process: an agent may need not only to learn how to behave in a complex environment, but also to account for the actions of the other learning agents. The inability of agents to distinguish the true environmental dynamics from those caused by the stochastic exploratory actions of other agents creates noise on each agent's reward signal. To address this, we introduce Coordinated Learning without Exploratory Action Noise (CLEAN) rewards, agent-specific shaped rewards that effectively remove such exploration noise from each agent's reward signal. We demonstrate their performance with up to 1000 agents in a standard congestion problem.
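The idea of removing exploratory action noise via private counterfactuals can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the bar-style congestion utility, the capacity parameter, and the helper names (`system_utility`, `clean_reward`, `train`) are assumptions chosen for the sketch. Every agent acts greedily in public, so no agent's reward is corrupted by others' exploration; each agent then privately evaluates a counterfactual exploratory action offline and updates from that signal.

```python
import random
import math

def system_utility(counts, capacity=5.0):
    """Congestion-style system utility (assumed form): each resource
    contributes attendance * exp(-attendance / capacity)."""
    return sum(x * math.exp(-x / capacity) for x in counts)

def counts_for(actions, n_actions):
    """Tally how many agents chose each of the n_actions resources."""
    counts = [0] * n_actions
    for a in actions:
        counts[a] += 1
    return counts

def clean_reward(actions, agent, counterfactual, n_actions):
    """CLEAN-style reward: utility of the joint action with this agent's
    action privately swapped for a counterfactual exploratory action,
    minus the utility of the joint action actually taken."""
    actual = counts_for(actions, n_actions)
    cf = list(actual)
    cf[actions[agent]] -= 1
    cf[counterfactual] += 1
    return system_utility(cf) - system_utility(actual)

def train(n_agents=50, n_actions=5, episodes=200, alpha=0.1, seed=0):
    """Stateless Q-learners: public behavior is greedy, exploration
    happens only in the private counterfactual update."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_agents)]
    for _ in range(episodes):
        joint = [max(range(n_actions), key=lambda a: qi[a]) for qi in q]
        for i in range(n_agents):
            c = rng.randrange(n_actions)  # private exploratory action
            q[i][c] += alpha * (clean_reward(joint, i, c, n_actions) - q[i][c])
    return q
```

Because the counterfactual swap never reaches the environment, other agents observe only greedy joint actions, which is the mechanism by which the exploration noise is removed from their reward signals.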
Similar resources
Counterfactual Exploration for Improving Multiagent Learning
In any single agent system, exploration is a critical component of learning. It ensures that all possible actions receive some degree of attention, allowing an agent to converge to good policies. The same concept has been adopted by multiagent learning systems. However, there is a fundamentally different dynamic in multiagent learning: each agent operates in a non-stationary environment, as a d...
Feature Selection as a Multiagent Coordination Problem
Datasets with hundreds to tens of thousands of features are the new norm. Feature selection constitutes a central problem in machine learning, where the aim is to derive a representative set of features from which to construct a classification (or prediction) model for a specific task. Our experimental study involves microarray gene expression datasets; these are high-dimensional and noisy datasets...
Combining reward shaping and hierarchies for scaling to large multiagent systems
Coordinating the actions of agents in multiagent systems presents a challenging problem, especially as the size of the system is increased and predicting the agent interactions becomes difficult. Many approaches to improving coordination within multiagent systems have been developed including organizational structures, shaped rewards, coordination graphs, heuristic methods, and learning automat...
CLEANing the Reward: Counterfactual Actions Remove Exploratory Action Noise in Multiagent Learning
Coordinating the joint-actions of agents in cooperative multiagent systems is a difficult problem in many real world domains. Learning in such multiagent systems can be slow because an agent may not only need to learn how to behave in a complex environment, but also to account for the actions of other learning agents. The inability of an agent to distinguish between the true environmental dynam...
Exploiting structure and utilizing agent-centric rewards to promote coordination in large multiagent systems
A goal within the field of multiagent systems is to achieve scaling to large systems involving hundreds or thousands of agents. In such systems the communication requirements for agents as well as the individual agents’ ability to make decisions both play critical roles in performance. We take an incremental step towards improving scalability in such systems by introducing a novel algorithm tha...
Journal title:
Volume, Issue:
Pages: -
Publication date: 2013